Quantitative Analysis of F0-induced Variations of Cepstrum Coefficients

Authors

  • Nobuaki MINEMATSU
  • Keiichi TSUDA
  • Keikichi HIROSE
Abstract

This paper quantitatively analyzes the correlation between cepstrum coefficients and fundamental frequency (F0). One of our previous studies showed that the cepstrum coefficients of vowel sounds vary with F0 and that this variation can be modeled by multivariate regression analysis. Building on that study, the current work analyzes the correlation in voiced consonants, the correlation in unvoiced consonants, and the dependency of the correlation on speakers and phonemes. Several experiments are then carried out to examine whether the models built to characterize the correlation can be used for speech recognition. Results show that the distance between the distributions of two similar phones, such as /s/ and /z/ or /m/ and /n/, is significantly increased by applying the models.
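The abstract does not give the exact form of the regression model, so the following is only a minimal sketch of the general idea, assuming a simple linear dependence of each cepstrum coefficient on log F0. The function names, the synthetic data, and the 200 Hz reference F0 are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def fit_f0_regression(cepstra, f0_hz):
    """Least-squares fit of cepstra ~ intercept + slope * log(F0).

    cepstra: array of shape (frames, coefficients)
    f0_hz:   array of shape (frames,), F0 of each frame in Hz
    Returns (intercept, slope), each of shape (coefficients,).
    """
    log_f0 = np.log(np.asarray(f0_hz, dtype=float))
    # Design matrix with a constant column and a log-F0 column.
    design = np.column_stack([np.ones_like(log_f0), log_f0])
    weights, *_ = np.linalg.lstsq(design, np.asarray(cepstra, dtype=float), rcond=None)
    return weights[0], weights[1]


def remove_f0_effect(cepstra, f0_hz, slope, ref_f0_hz):
    """Shift each frame to the value the model predicts at a common reference F0."""
    shift = np.log(np.asarray(f0_hz, dtype=float)) - np.log(ref_f0_hz)
    return np.asarray(cepstra, dtype=float) - shift[:, None] * slope[None, :]


# Synthetic data (an assumption for illustration only, not measured speech).
rng = np.random.default_rng(0)
n_frames, n_coeffs = 500, 4
f0 = rng.uniform(100.0, 300.0, size=n_frames)
true_slope = np.array([0.8, -0.5, 0.3, 0.0])
noise = 0.05 * rng.standard_normal((n_frames, n_coeffs))
cepstra = 1.0 + np.log(f0)[:, None] * true_slope + noise

intercept, slope = fit_f0_regression(cepstra, f0)
normalized = remove_f0_effect(cepstra, f0, slope, ref_f0_hz=200.0)
```

After the fitted F0 component is removed, the normalized coefficients are uncorrelated with log F0 in the fitting data. One plausible reading of the abstract is that reducing F0-induced spread in this way is what increases the separation between similar phone distributions, but that link is an interpretation rather than something stated in the text.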


Similar resources

Modeling of variations in cepstral coefficients caused by F0 changes and its application to speech processing

In this paper, the correlation between spectral variations and F0 changes in a vowel sound is firstly analyzed, where the variations are also compared to VQ distortions calculated in a five-vowel space. It is shown that an F0 change of approximately half an octave produces a spectral variation comparable to the averaged VQ distortion when the codebook size equals the number of vowels. Next, a mod...


Prediction of Epileptic Seizures in Patients with Temporal Lobe Epilepsy (TLE) based on Cepstrum analysis and AR model of EEG signal

Epilepsy is a chronic disorder of brain function caused by abnormal and excessive electrical discharge of neurons in the brain. Seizures cause disturbances in consciousness that occur without prior notice, so the ability to predict them from EEG data can reduce stress and improve quality of life. An epileptic patient's EEG data consist of five parts: Ictal, Inter-Ictal, pre-Ictal, Post-Ictal, an...


Improving the performance of HMM-based very low bit rate speech coding

In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMM (Hidden Markov Model). In the coding system, the encoder carries out phoneme recognition, and transmits phoneme indices, state durations and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient v...


Estimation of fundamental frequency of reverberant speech by utilizing complex cepstrum analysis

This paper reports comparative evaluations of twelve typical methods of estimating fundamental frequency (F0) over huge speech-sound datasets in artificial reverberant environments. They involve several classic algorithms such as Cepstrum, AMDF, LPC, and modified autocorrelation algorithms. Other methods involve a few modern instantaneous amplitude- and/or frequency-based algorithms, such as STRA...


Highly Accurate Mandarin Tone Classification In The Absence of Pitch Information

A deep neural network (DNN) classifier based only on 40 mel-frequency cepstral coefficients (MFCCs) achieved 29.99% frame error rate (FER) and 16.86% segment error rate (SER) in recognizing five tonal categories in Mandarin Chinese broadcast news. With the addition of subband autocorrelation change detection (SACD) pitch-class features [1], the classifier scored 27.58% FER and 15.56% SER. These...



Publication date: 2001